Skip to content

[Headers][X86] Allow AVX512 _mm512_set* intrinsics to be used in constexpr #152910

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Aug 11, 2025

Conversation

markbhasawut
Copy link
Contributor

This PR adds constexpr support for the following AVX512F/BW/FP16 set intrinsics:

  • _mm512_set4_pd

  • _mm512_set4_ps

  • _mm512_set4_epi32

  • _mm512_set4_epi64

  • _mm512_set_pd

  • _mm512_set_ps

  • _mm512_set_epi8

  • _mm512_set_epi16

  • _mm512_set_epi32

  • _mm512_set_epi64

  • _mm512_setr_pd

  • _mm512_setr_ps

  • _mm512_setr_epi32

  • _mm512_setr_epi64

  • _mm512_set1_ph

  • _mm512_set_ph

  • _mm512_setr_ph

  • _mm_setzero_ph

  • _mm256_setzero_ph

  • _mm512_setzero_ph

Closes #152288.
Part of #30794.

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@markbhasawut markbhasawut force-pushed the x86-constexpr-avx512-set branch from 4965bac to f1baf0d Compare August 10, 2025 20:33
@markbhasawut markbhasawut marked this pull request as ready for review August 10, 2025 20:34
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:X86 clang:headers Headers provided by Clang, e.g. for intrinsics labels Aug 10, 2025
@llvmbot
Copy link
Member

llvmbot commented Aug 10, 2025

@llvm/pr-subscribers-clang

Author: Bhasawut Singhaphan (markbhasawut)

Changes

This PR adds constexpr support for the following AVX512F/BW/FP16 set intrinsics:

  • _mm512_set4_pd

  • _mm512_set4_ps

  • _mm512_set4_epi32

  • _mm512_set4_epi64

  • _mm512_set_pd

  • _mm512_set_ps

  • _mm512_set_epi8

  • _mm512_set_epi16

  • _mm512_set_epi32

  • _mm512_set_epi64

  • _mm512_setr_pd

  • _mm512_setr_ps

  • _mm512_setr_epi32

  • _mm512_setr_epi64

  • _mm512_set1_ph

  • _mm512_set_ph

  • _mm512_setr_ph

  • _mm_setzero_ph

  • _mm256_setzero_ph

  • _mm512_setzero_ph

Closes #152288.
Part of #30794.


Patch is 21.69 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/152910.diff

4 Files Affected:

  • (modified) clang/lib/Headers/avx512fintrin.h (+40-54)
  • (modified) clang/lib/Headers/avx512fp16intrin.h (+21-5)
  • (modified) clang/test/CodeGen/X86/avx512f-builtins.c (+15-1)
  • (modified) clang/test/CodeGen/X86/avx512fp16-builtins.c (+20)
diff --git a/clang/lib/Headers/avx512fintrin.h b/clang/lib/Headers/avx512fintrin.h
index 95b80cc59bb02..327a044f369ce 100644
--- a/clang/lib/Headers/avx512fintrin.h
+++ b/clang/lib/Headers/avx512fintrin.h
@@ -351,32 +351,27 @@ _mm512_broadcastss_ps(__m128 __A)
                                          0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
 }
 
-static __inline __m512i __DEFAULT_FN_ATTRS512
-_mm512_set4_epi32 (int __A, int __B, int __C, int __D)
-{
+static __inline __m512i __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_set4_epi32(int __A, int __B, int __C, int __D) {
   return __extension__ (__m512i)(__v16si)
    { __D, __C, __B, __A, __D, __C, __B, __A,
      __D, __C, __B, __A, __D, __C, __B, __A };
 }
 
-static __inline __m512i __DEFAULT_FN_ATTRS512
-_mm512_set4_epi64 (long long __A, long long __B, long long __C,
-       long long __D)
-{
+static __inline __m512i __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_set4_epi64(long long __A, long long __B, long long __C, long long __D) {
   return __extension__ (__m512i) (__v8di)
    { __D, __C, __B, __A, __D, __C, __B, __A };
 }
 
-static __inline __m512d __DEFAULT_FN_ATTRS512
-_mm512_set4_pd (double __A, double __B, double __C, double __D)
-{
+static __inline __m512d __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_set4_pd(double __A, double __B, double __C, double __D) {
   return __extension__ (__m512d)
    { __D, __C, __B, __A, __D, __C, __B, __A };
 }
 
-static __inline __m512 __DEFAULT_FN_ATTRS512
-_mm512_set4_ps (float __A, float __B, float __C, float __D)
-{
+static __inline __m512 __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_set4_ps(float __A, float __B, float __C, float __D) {
   return __extension__ (__m512)
    { __D, __C, __B, __A, __D, __C, __B, __A,
      __D, __C, __B, __A, __D, __C, __B, __A };
@@ -9191,18 +9186,18 @@ _mm512_mask_set1_epi64 (__m512i __O, __mmask8 __M, long long __A)
                                               (__v8di) __O);
 }
 
-static  __inline __m512i __DEFAULT_FN_ATTRS512
-_mm512_set_epi8 (char __e63, char __e62, char __e61, char __e60, char __e59,
-    char __e58, char __e57, char __e56, char __e55, char __e54, char __e53,
-    char __e52, char __e51, char __e50, char __e49, char __e48, char __e47,
-    char __e46, char __e45, char __e44, char __e43, char __e42, char __e41,
-    char __e40, char __e39, char __e38, char __e37, char __e36, char __e35,
-    char __e34, char __e33, char __e32, char __e31, char __e30, char __e29,
-    char __e28, char __e27, char __e26, char __e25, char __e24, char __e23,
-    char __e22, char __e21, char __e20, char __e19, char __e18, char __e17,
-    char __e16, char __e15, char __e14, char __e13, char __e12, char __e11,
-    char __e10, char __e9, char __e8, char __e7, char __e6, char __e5,
-    char __e4, char __e3, char __e2, char __e1, char __e0) {
+static __inline __m512i __DEFAULT_FN_ATTRS512_CONSTEXPR _mm512_set_epi8(
+    char __e63, char __e62, char __e61, char __e60, char __e59, char __e58,
+    char __e57, char __e56, char __e55, char __e54, char __e53, char __e52,
+    char __e51, char __e50, char __e49, char __e48, char __e47, char __e46,
+    char __e45, char __e44, char __e43, char __e42, char __e41, char __e40,
+    char __e39, char __e38, char __e37, char __e36, char __e35, char __e34,
+    char __e33, char __e32, char __e31, char __e30, char __e29, char __e28,
+    char __e27, char __e26, char __e25, char __e24, char __e23, char __e22,
+    char __e21, char __e20, char __e19, char __e18, char __e17, char __e16,
+    char __e15, char __e14, char __e13, char __e12, char __e11, char __e10,
+    char __e9, char __e8, char __e7, char __e6, char __e5, char __e4, char __e3,
+    char __e2, char __e1, char __e0) {
 
   return __extension__ (__m512i)(__v64qi)
     {__e0, __e1, __e2, __e3, __e4, __e5, __e6, __e7,
@@ -9215,14 +9210,13 @@ _mm512_set_epi8 (char __e63, char __e62, char __e61, char __e60, char __e59,
      __e56, __e57, __e58, __e59, __e60, __e61, __e62, __e63};
 }
 
-static  __inline __m512i __DEFAULT_FN_ATTRS512
-_mm512_set_epi16(short __e31, short __e30, short __e29, short __e28,
-    short __e27, short __e26, short __e25, short __e24, short __e23,
-    short __e22, short __e21, short __e20, short __e19, short __e18,
-    short __e17, short __e16, short __e15, short __e14, short __e13,
-    short __e12, short __e11, short __e10, short __e9, short __e8,
-    short __e7, short __e6, short __e5, short __e4, short __e3,
-    short __e2, short __e1, short __e0) {
+static __inline __m512i __DEFAULT_FN_ATTRS512_CONSTEXPR _mm512_set_epi16(
+    short __e31, short __e30, short __e29, short __e28, short __e27,
+    short __e26, short __e25, short __e24, short __e23, short __e22,
+    short __e21, short __e20, short __e19, short __e18, short __e17,
+    short __e16, short __e15, short __e14, short __e13, short __e12,
+    short __e11, short __e10, short __e9, short __e8, short __e7, short __e6,
+    short __e5, short __e4, short __e3, short __e2, short __e1, short __e0) {
   return __extension__ (__m512i)(__v32hi)
     {__e0, __e1, __e2, __e3, __e4, __e5, __e6, __e7,
      __e8, __e9, __e10, __e11, __e12, __e13, __e14, __e15,
@@ -9230,12 +9224,9 @@ _mm512_set_epi16(short __e31, short __e30, short __e29, short __e28,
      __e24, __e25, __e26, __e27, __e28, __e29, __e30, __e31 };
 }
 
-static __inline __m512i __DEFAULT_FN_ATTRS512
-_mm512_set_epi32 (int __A, int __B, int __C, int __D,
-     int __E, int __F, int __G, int __H,
-     int __I, int __J, int __K, int __L,
-     int __M, int __N, int __O, int __P)
-{
+static __inline __m512i __DEFAULT_FN_ATTRS512_CONSTEXPR _mm512_set_epi32(
+    int __A, int __B, int __C, int __D, int __E, int __F, int __G, int __H,
+    int __I, int __J, int __K, int __L, int __M, int __N, int __O, int __P) {
   return __extension__ (__m512i)(__v16si)
   { __P, __O, __N, __M, __L, __K, __J, __I,
     __H, __G, __F, __E, __D, __C, __B, __A };
@@ -9246,11 +9237,9 @@ _mm512_set_epi32 (int __A, int __B, int __C, int __D,
   _mm512_set_epi32((e15),(e14),(e13),(e12),(e11),(e10),(e9),(e8),(e7),(e6), \
                    (e5),(e4),(e3),(e2),(e1),(e0))
 
-static __inline__ __m512i __DEFAULT_FN_ATTRS512
-_mm512_set_epi64 (long long __A, long long __B, long long __C,
-     long long __D, long long __E, long long __F,
-     long long __G, long long __H)
-{
+static __inline__ __m512i __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_set_epi64(long long __A, long long __B, long long __C, long long __D,
+                 long long __E, long long __F, long long __G, long long __H) {
   return __extension__ (__m512i) (__v8di)
   { __H, __G, __F, __E, __D, __C, __B, __A };
 }
@@ -9258,10 +9247,9 @@ _mm512_set_epi64 (long long __A, long long __B, long long __C,
 #define _mm512_setr_epi64(e0,e1,e2,e3,e4,e5,e6,e7)           \
   _mm512_set_epi64((e7),(e6),(e5),(e4),(e3),(e2),(e1),(e0))
 
-static __inline__ __m512d __DEFAULT_FN_ATTRS512
-_mm512_set_pd (double __A, double __B, double __C, double __D,
-        double __E, double __F, double __G, double __H)
-{
+static __inline__ __m512d __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_set_pd(double __A, double __B, double __C, double __D, double __E,
+              double __F, double __G, double __H) {
   return __extension__ (__m512d)
   { __H, __G, __F, __E, __D, __C, __B, __A };
 }
@@ -9269,12 +9257,10 @@ _mm512_set_pd (double __A, double __B, double __C, double __D,
 #define _mm512_setr_pd(e0,e1,e2,e3,e4,e5,e6,e7)              \
   _mm512_set_pd((e7),(e6),(e5),(e4),(e3),(e2),(e1),(e0))
 
-static __inline__ __m512 __DEFAULT_FN_ATTRS512
-_mm512_set_ps (float __A, float __B, float __C, float __D,
-        float __E, float __F, float __G, float __H,
-        float __I, float __J, float __K, float __L,
-        float __M, float __N, float __O, float __P)
-{
+static __inline__ __m512 __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_set_ps(float __A, float __B, float __C, float __D, float __E, float __F,
+              float __G, float __H, float __I, float __J, float __K, float __L,
+              float __M, float __N, float __O, float __P) {
   return __extension__ (__m512)
   { __P, __O, __N, __M, __L, __K, __J, __I,
     __H, __G, __F, __E, __D, __C, __B, __A };
diff --git a/clang/lib/Headers/avx512fp16intrin.h b/clang/lib/Headers/avx512fp16intrin.h
index 92df320b45006..e1b198b7fc5c9 100644
--- a/clang/lib/Headers/avx512fp16intrin.h
+++ b/clang/lib/Headers/avx512fp16intrin.h
@@ -33,15 +33,26 @@ typedef _Float16 __m512h_u __attribute__((__vector_size__(64), __aligned__(1)));
                  __target__("avx512fp16,no-evex512"),                          \
                  __min_vector_width__(128)))
 
+#if defined(__cplusplus) && (__cplusplus >= 201103L)
+#define __DEFAULT_FN_ATTRS512_CONSTEXPR __DEFAULT_FN_ATTRS512 constexpr
+#define __DEFAULT_FN_ATTRS256_CONSTEXPR __DEFAULT_FN_ATTRS256 constexpr
+#define __DEFAULT_FN_ATTRS128_CONSTEXPR __DEFAULT_FN_ATTRS128 constexpr
+#else
+#define __DEFAULT_FN_ATTRS512_CONSTEXPR __DEFAULT_FN_ATTRS512
+#define __DEFAULT_FN_ATTRS256_CONSTEXPR __DEFAULT_FN_ATTRS256
+#define __DEFAULT_FN_ATTRS128_CONSTEXPR __DEFAULT_FN_ATTRS128
+#endif
+
 static __inline__ _Float16 __DEFAULT_FN_ATTRS512 _mm512_cvtsh_h(__m512h __a) {
   return __a[0];
 }
 
-static __inline __m128h __DEFAULT_FN_ATTRS128 _mm_setzero_ph(void) {
+static __inline __m128h __DEFAULT_FN_ATTRS128_CONSTEXPR _mm_setzero_ph(void) {
   return (__m128h){0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
 }
 
-static __inline __m256h __DEFAULT_FN_ATTRS256 _mm256_setzero_ph(void) {
+static __inline __m256h __DEFAULT_FN_ATTRS256_CONSTEXPR
+_mm256_setzero_ph(void) {
   return (__m256h){0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
                    0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
 }
@@ -50,7 +61,8 @@ static __inline__ __m256h __DEFAULT_FN_ATTRS256 _mm256_undefined_ph(void) {
   return (__m256h)__builtin_ia32_undef256();
 }
 
-static __inline __m512h __DEFAULT_FN_ATTRS512 _mm512_setzero_ph(void) {
+static __inline __m512h __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_setzero_ph(void) {
   return (__m512h){0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
                    0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
                    0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
@@ -64,14 +76,15 @@ static __inline__ __m512h __DEFAULT_FN_ATTRS512 _mm512_undefined_ph(void) {
   return (__m512h)__builtin_ia32_undef512();
 }
 
-static __inline __m512h __DEFAULT_FN_ATTRS512 _mm512_set1_ph(_Float16 __h) {
+static __inline __m512h __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_set1_ph(_Float16 __h) {
   return (__m512h)(__v32hf){__h, __h, __h, __h, __h, __h, __h, __h,
                             __h, __h, __h, __h, __h, __h, __h, __h,
                             __h, __h, __h, __h, __h, __h, __h, __h,
                             __h, __h, __h, __h, __h, __h, __h, __h};
 }
 
-static __inline __m512h __DEFAULT_FN_ATTRS512
+static __inline __m512h __DEFAULT_FN_ATTRS512_CONSTEXPR
 _mm512_set_ph(_Float16 __h1, _Float16 __h2, _Float16 __h3, _Float16 __h4,
               _Float16 __h5, _Float16 __h6, _Float16 __h7, _Float16 __h8,
               _Float16 __h9, _Float16 __h10, _Float16 __h11, _Float16 __h12,
@@ -3348,6 +3361,9 @@ _mm512_permutexvar_ph(__m512i __A, __m512h __B) {
 #undef __DEFAULT_FN_ATTRS128
 #undef __DEFAULT_FN_ATTRS256
 #undef __DEFAULT_FN_ATTRS512
+#undef __DEFAULT_FN_ATTRS128_CONSTEXPR
+#undef __DEFAULT_FN_ATTRS256_CONSTEXPR
+#undef __DEFAULT_FN_ATTRS512_CONSTEXPR
 
 #endif
 #endif
diff --git a/clang/test/CodeGen/X86/avx512f-builtins.c b/clang/test/CodeGen/X86/avx512f-builtins.c
index 8c446f5715c00..6b82ba51ffc48 100644
--- a/clang/test/CodeGen/X86/avx512f-builtins.c
+++ b/clang/test/CodeGen/X86/avx512f-builtins.c
@@ -10276,6 +10276,11 @@ __m512i test_mm512_set_epi8(char e63, char e62, char e61, char e60, char e59,
       e25, e24, e23, e22, e21, e20, e19, e18, e17, e16, e15, e14, e13, e12,
       e11, e10, e9, e8, e7, e6, e5, e4, e3, e2, e1, e0);
 }
+TEST_CONSTEXPR(match_v64qi(_mm512_set_epi8(63, 62, 61, 60, 59, 58, 57, 56, 55, 54, 53, 52, 51, 50, 49, 48,
+                                          47, 46, 45, 44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34, 33, 32, 
+                                          31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16,
+                                          15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0), 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
+                                          32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63));
 
 __m512i test_mm512_set_epi16(short e31, short e30, short e29, short e28,
     short e27, short e26, short e25, short e24, short e23, short e22,
@@ -10319,8 +10324,9 @@ __m512i test_mm512_set_epi16(short e31, short e30, short e29, short e28,
   return _mm512_set_epi16(e31, e30, e29, e28, e27, e26, e25, e24, e23, e22,
       e21, e20, e19, e18, e17, e16, e15, e14, e13, e12, e11, e10, e9, e8, e7,
       e6, e5, e4, e3, e2, e1, e0);
-
 }
+TEST_CONSTEXPR(match_v32hi(_mm512_set_epi16(-31, 30, -29, 28, -27, 26, -25, 24, -23, 22, -21, 20, -19, 18, -17, 16, -15, 14, -13, 12, -11, 10, -9, 8, -7, 6, -5, 4, -3, 2, -1, 0), 0, -1, 2, -3, 4, -5, 6, -7, 8, -9, 10, -11, 12, -13, 14, -15, 16, -17, 18, -19, 20, -21, 22, -23, 24, -25, 26, -27, 28, -29, 30, -31));
+
 __m512i test_mm512_set_epi32 (int __A, int __B, int __C, int __D,
                int __E, int __F, int __G, int __H,
                int __I, int __J, int __K, int __L,
@@ -10346,6 +10352,7 @@ __m512i test_mm512_set_epi32 (int __A, int __B, int __C, int __D,
  return _mm512_set_epi32( __A, __B, __C, __D,__E, __F, __G, __H,
               __I, __J, __K, __L,__M, __N, __O, __P);
 }
+TEST_CONSTEXPR(match_v16si(_mm512_set_epi32(-15, -14, -13, -12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0), 0, -1, -2, -3, -4, -5, -6, -7, -8, -9, -10, -11, -12, -13, -14, -15));
 
 __m512i test_mm512_setr_epi32 (int __A, int __B, int __C, int __D,
                int __E, int __F, int __G, int __H,
@@ -10388,6 +10395,7 @@ __m512i test_mm512_setr_epi32 (int __A, int __B, int __C, int __D,
  return _mm512_setr_epi32( __A, __B, __C, __D,__E, __F, __G, __H,
               __I, __J, __K, __L,__M, __N, __O, __P);
 }
+TEST_CONSTEXPR(match_v16si(_mm512_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15), 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15));
 
 __m512i test_mm512_mask_set1_epi64 (__m512i __O, __mmask8 __M, long long __A)
 {
@@ -10435,6 +10443,7 @@ __m512i test_mm512_set_epi64 (long long __A, long long __B, long long __C,
     //CHECK: insertelement{{.*}}i32 7
   return _mm512_set_epi64(__A, __B, __C, __D, __E, __F, __G, __H );
 }
+TEST_CONSTEXPR(match_v8di(_mm512_set_epi64(-15, 13, -11, 9, -7, 5, -3, 1),  1, -3, 5, -7, 9, -11, 13, -15));
 
 __m512i test_mm512_setr_epi64 (long long __A, long long __B, long long __C,
                               long long __D, long long __E, long long __F,
@@ -10459,6 +10468,7 @@ __m512i test_mm512_setr_epi64 (long long __A, long long __B, long long __C,
     //CHECK: insertelement{{.*}}i32 7
   return _mm512_setr_epi64(__A, __B, __C, __D, __E, __F, __G, __H );
 }
+TEST_CONSTEXPR(match_v8di(_mm512_setr_epi64(-1, 3, -5, 7, -9, 11, -13, 15), -1, 3, -5, 7, -9, 11, -13, 15));
 
 __m512d test_mm512_set_pd (double __A, double __B, double __C, double __D,
                            double __E, double __F, double __G, double __H)
@@ -10474,6 +10484,7 @@ __m512d test_mm512_set_pd (double __A, double __B, double __C, double __D,
     //CHECK: insertelement{{.*}}i32 7
   return _mm512_set_pd( __A, __B, __C, __D, __E, __F, __G, __H);
 }
+TEST_CONSTEXPR(match_m512d(_mm512_set_pd(20.0, 40.0, 60.0, 80.0, 100.0, 120.0, 140.0, 160.0), 160.0, 140.0, 120.0, 100.0, 80.0, 60.0, 40.0, 20.0));
 
 __m512d test_mm512_setr_pd (double __A, double __B, double __C, double __D,
                            double __E, double __F, double __G, double __H)
@@ -10497,6 +10508,7 @@ __m512d test_mm512_setr_pd (double __A, double __B, double __C, double __D,
     //CHECK: insertelement{{.*}}i32 7
   return _mm512_setr_pd( __A, __B, __C, __D, __E, __F, __G, __H);
 }
+TEST_CONSTEXPR(match_m512d(_mm512_setr_pd(-20.0, 40.0, -60.0, 80.0, -100.0, 120.0, -140.0, 160.0), -20.0, 40.0, -60.0, 80.0, -100.0, 120.0, -140.0, 160.0));
 
 __m512 test_mm512_set_ps (float __A, float __B, float __C, float __D,
                           float __E, float __F, float __G, float __H,
@@ -10523,6 +10535,7 @@ __m512 test_mm512_set_ps (float __A, float __B, float __C, float __D,
     return _mm512_set_ps( __A, __B, __C, __D, __E, __F, __G, __H,
                           __I, __J, __K, __L, __M, __N, __O, __P);
 }
+TEST_CONSTEXPR(match_m512(_mm512_set_ps(-16.0f, 15.0f, -14.0f, 13.0f, -12.0f, 11.0f, -10.0f, 9.0f, -8.0f, 7.0f, -6.0f, 5.0f, -4.0f, 3.0f, -2.0f, 1.0f), 1.0f, -2.0f, 3.0f, -4.0f, 5.0f, -6.0f, 7.0f, -8.0f, 9.0f, -10.0f, 11.0f, -12.0f, 13.0f, -14.0f, 15.0f, -16.0f));
 
 __m512i test_mm512_mask_abs_epi64 (__m512i __W, __mmask8 __U, __m512i __A)
 {
@@ -10601,6 +10614,7 @@ __m512 test_mm512_setr_ps (float __A, float __B, float __C, float __D,
     return _mm512_setr_ps( __A, __B, __C, __D, __E, __F, __G, __H,
                           __I, __J, __K, __L, __M, __N, __O, __P);
 }
+TEST_CONSTEXPR(match_m512(_mm512_setr_ps(-1.0f, 2.0f, -3.0f, 4.0f, -5.0f, 6.0f, -7.0f, 8.0f, -9.0f, 10.0f, -11.0f, 12.0f, -13.0f, 14.0f, -15.0f, 16.0f), -1.0f, 2.0f, -3.0f, 4.0f, -5.0f, 6.0f, -7.0f, 8.0f, -9.0f, 10.0f, -11.0f, 12.0f, -13.0f, 14.0f, -15.0f, 16.0f));
 
 int test_mm_cvtss_i32(__m128 A) {
   // CHECK-LABEL: test_mm_cvtss_i32
diff --git a/clang/test/CodeGen/X86/avx512fp16-builtins.c b/clang/test/CodeGen/X86/avx512fp16-builtins.c
index d277d053147fd..f2b5bd0e2f763 100644
--- a/clang/test/CodeGen/X86/avx512fp16-builtins.c
+++ b/clang/test/CodeGen/X86/avx512fp16-builtins.c
@@ -1,6 +1,7 @@
 // RUN: %clang_cc1 -ffreestanding -flax-vector-conversions=none %s -triple=x86_64-unknown-unknown -target-feature +avx512fp16 -emit-llvm -o - -Wall -Werror | FileCheck %s
 
 #include <immintrin.h>
+#include "builtin_test_helpers.h"
 
 _Float16 test_mm512_cvtsh_h(__m512h __A) {
   // CHECK-LABEL: @test_mm512_cvtsh_h
@@ -13,12 +14,14 @@ __m128h test_mm_setzero_ph(void) {
   // CHECK: zeroinitializer
   return _mm_setzero_ph();
 }
+TEST_CONSTEXPR(match_m128(_mm_setzero_ph(), +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f));
 
 __m256h test_mm256_setzero_ph(void) {
   // CHECK-LABEL: @test_mm256_setzero_ph
   // CHECK: zeroinitializer
   return _mm256_setzero_ph();
 }
+TEST_CONSTEXPR(match_m256(_mm256_setzero_ph(), +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f));
 
 __m256h test_mm256_undefined_ph(void) {
   // CHECK-LABEL: @test_mm256_undefined_ph
@@ -31,6 +34,7 @@ __m512h test_mm512_setzero_ph(void) {
   // CHECK: zeroinitializer
   return _mm512_setzero_ph();
 }
+TEST_CONSTEXPR(match_m512(_mm512_setzero_ph(), +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f));
 
 __m128h test_mm_undefined_ph(void) {
   // CHECK-LABEL: @test_mm_undefined_ph
@@ -80,6 +84,7 @@ __m512h test_mm512_set1_ph(_Float16 h) {
   // CHECK: insertelement <32 x half> {{.*}}, i32 31
   return _mm512_set1_ph(h);
 }
+TEST_CONSTEXPR(match_m512(_mm512_set1_ph(-101.0f), -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f));
 
 __m512h test_mm512_set1_pch(_Float16 _Complex h) {
   // CHECK-LABEL: @test_mm512_set1_pch
@@ -149,6 +154,13 @@ __m512h test_mm512_set_ph(_Float16 __h1, _Float16 __h2, _Float16 __h3, _Float16
                        __h17, __h18, __h19, __h20, __h21, __h22, __h23, __h24,
                        __h25, __h26, __h27, __h28, __h29, __h30, __h31, __h32);
 }
+TEST_CONSTEXPR(match_mm512(_mm512_set_ph(-1.0f, 2.0f, -3.0f, 4.0f, -5.0f, 6.0f, -7.0f, 8.0f,
+                                        -9.0f, 10.0f, -11.0f, 12.0f, -13.0f, 14.0f, -15.0f, 16.0f,
+                                        -17.0f, 18.0f, -19.0f, 20.0f, -21.0f, 22.0f, -23.0f, 24.0f,
+     ...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Aug 10, 2025

@llvm/pr-subscribers-backend-x86

Author: Bhasawut Singhaphan (markbhasawut)

Changes

This PR adds constexpr support for the following AVX512F/BW/FP16 set intrinsics:

  • _mm512_set4_pd

  • _mm512_set4_ps

  • _mm512_set4_epi32

  • _mm512_set4_epi64

  • _mm512_set_pd

  • _mm512_set_ps

  • _mm512_set_epi8

  • _mm512_set_epi16

  • _mm512_set_epi32

  • _mm512_set_epi64

  • _mm512_setr_pd

  • _mm512_setr_ps

  • _mm512_setr_epi32

  • _mm512_setr_epi64

  • _mm512_set1_ph

  • _mm512_set_ph

  • _mm512_setr_ph

  • _mm_setzero_ph

  • _mm256_setzero_ph

  • _mm512_setzero_ph

Closes #152288.
Part of #30794.


Patch is 21.69 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/152910.diff

4 Files Affected:

  • (modified) clang/lib/Headers/avx512fintrin.h (+40-54)
  • (modified) clang/lib/Headers/avx512fp16intrin.h (+21-5)
  • (modified) clang/test/CodeGen/X86/avx512f-builtins.c (+15-1)
  • (modified) clang/test/CodeGen/X86/avx512fp16-builtins.c (+20)
diff --git a/clang/lib/Headers/avx512fintrin.h b/clang/lib/Headers/avx512fintrin.h
index 95b80cc59bb02..327a044f369ce 100644
--- a/clang/lib/Headers/avx512fintrin.h
+++ b/clang/lib/Headers/avx512fintrin.h
@@ -351,32 +351,27 @@ _mm512_broadcastss_ps(__m128 __A)
                                          0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
 }
 
-static __inline __m512i __DEFAULT_FN_ATTRS512
-_mm512_set4_epi32 (int __A, int __B, int __C, int __D)
-{
+static __inline __m512i __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_set4_epi32(int __A, int __B, int __C, int __D) {
   return __extension__ (__m512i)(__v16si)
    { __D, __C, __B, __A, __D, __C, __B, __A,
      __D, __C, __B, __A, __D, __C, __B, __A };
 }
 
-static __inline __m512i __DEFAULT_FN_ATTRS512
-_mm512_set4_epi64 (long long __A, long long __B, long long __C,
-       long long __D)
-{
+static __inline __m512i __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_set4_epi64(long long __A, long long __B, long long __C, long long __D) {
   return __extension__ (__m512i) (__v8di)
    { __D, __C, __B, __A, __D, __C, __B, __A };
 }
 
-static __inline __m512d __DEFAULT_FN_ATTRS512
-_mm512_set4_pd (double __A, double __B, double __C, double __D)
-{
+static __inline __m512d __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_set4_pd(double __A, double __B, double __C, double __D) {
   return __extension__ (__m512d)
    { __D, __C, __B, __A, __D, __C, __B, __A };
 }
 
-static __inline __m512 __DEFAULT_FN_ATTRS512
-_mm512_set4_ps (float __A, float __B, float __C, float __D)
-{
+static __inline __m512 __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_set4_ps(float __A, float __B, float __C, float __D) {
   return __extension__ (__m512)
    { __D, __C, __B, __A, __D, __C, __B, __A,
      __D, __C, __B, __A, __D, __C, __B, __A };
@@ -9191,18 +9186,18 @@ _mm512_mask_set1_epi64 (__m512i __O, __mmask8 __M, long long __A)
                                               (__v8di) __O);
 }
 
-static  __inline __m512i __DEFAULT_FN_ATTRS512
-_mm512_set_epi8 (char __e63, char __e62, char __e61, char __e60, char __e59,
-    char __e58, char __e57, char __e56, char __e55, char __e54, char __e53,
-    char __e52, char __e51, char __e50, char __e49, char __e48, char __e47,
-    char __e46, char __e45, char __e44, char __e43, char __e42, char __e41,
-    char __e40, char __e39, char __e38, char __e37, char __e36, char __e35,
-    char __e34, char __e33, char __e32, char __e31, char __e30, char __e29,
-    char __e28, char __e27, char __e26, char __e25, char __e24, char __e23,
-    char __e22, char __e21, char __e20, char __e19, char __e18, char __e17,
-    char __e16, char __e15, char __e14, char __e13, char __e12, char __e11,
-    char __e10, char __e9, char __e8, char __e7, char __e6, char __e5,
-    char __e4, char __e3, char __e2, char __e1, char __e0) {
+static __inline __m512i __DEFAULT_FN_ATTRS512_CONSTEXPR _mm512_set_epi8(
+    char __e63, char __e62, char __e61, char __e60, char __e59, char __e58,
+    char __e57, char __e56, char __e55, char __e54, char __e53, char __e52,
+    char __e51, char __e50, char __e49, char __e48, char __e47, char __e46,
+    char __e45, char __e44, char __e43, char __e42, char __e41, char __e40,
+    char __e39, char __e38, char __e37, char __e36, char __e35, char __e34,
+    char __e33, char __e32, char __e31, char __e30, char __e29, char __e28,
+    char __e27, char __e26, char __e25, char __e24, char __e23, char __e22,
+    char __e21, char __e20, char __e19, char __e18, char __e17, char __e16,
+    char __e15, char __e14, char __e13, char __e12, char __e11, char __e10,
+    char __e9, char __e8, char __e7, char __e6, char __e5, char __e4, char __e3,
+    char __e2, char __e1, char __e0) {
 
   return __extension__ (__m512i)(__v64qi)
     {__e0, __e1, __e2, __e3, __e4, __e5, __e6, __e7,
@@ -9215,14 +9210,13 @@ _mm512_set_epi8 (char __e63, char __e62, char __e61, char __e60, char __e59,
      __e56, __e57, __e58, __e59, __e60, __e61, __e62, __e63};
 }
 
-static  __inline __m512i __DEFAULT_FN_ATTRS512
-_mm512_set_epi16(short __e31, short __e30, short __e29, short __e28,
-    short __e27, short __e26, short __e25, short __e24, short __e23,
-    short __e22, short __e21, short __e20, short __e19, short __e18,
-    short __e17, short __e16, short __e15, short __e14, short __e13,
-    short __e12, short __e11, short __e10, short __e9, short __e8,
-    short __e7, short __e6, short __e5, short __e4, short __e3,
-    short __e2, short __e1, short __e0) {
+static __inline __m512i __DEFAULT_FN_ATTRS512_CONSTEXPR _mm512_set_epi16(
+    short __e31, short __e30, short __e29, short __e28, short __e27,
+    short __e26, short __e25, short __e24, short __e23, short __e22,
+    short __e21, short __e20, short __e19, short __e18, short __e17,
+    short __e16, short __e15, short __e14, short __e13, short __e12,
+    short __e11, short __e10, short __e9, short __e8, short __e7, short __e6,
+    short __e5, short __e4, short __e3, short __e2, short __e1, short __e0) {
   return __extension__ (__m512i)(__v32hi)
     {__e0, __e1, __e2, __e3, __e4, __e5, __e6, __e7,
      __e8, __e9, __e10, __e11, __e12, __e13, __e14, __e15,
@@ -9230,12 +9224,9 @@ _mm512_set_epi16(short __e31, short __e30, short __e29, short __e28,
      __e24, __e25, __e26, __e27, __e28, __e29, __e30, __e31 };
 }
 
-static __inline __m512i __DEFAULT_FN_ATTRS512
-_mm512_set_epi32 (int __A, int __B, int __C, int __D,
-     int __E, int __F, int __G, int __H,
-     int __I, int __J, int __K, int __L,
-     int __M, int __N, int __O, int __P)
-{
+static __inline __m512i __DEFAULT_FN_ATTRS512_CONSTEXPR _mm512_set_epi32(
+    int __A, int __B, int __C, int __D, int __E, int __F, int __G, int __H,
+    int __I, int __J, int __K, int __L, int __M, int __N, int __O, int __P) {
   return __extension__ (__m512i)(__v16si)
   { __P, __O, __N, __M, __L, __K, __J, __I,
     __H, __G, __F, __E, __D, __C, __B, __A };
@@ -9246,11 +9237,9 @@ _mm512_set_epi32 (int __A, int __B, int __C, int __D,
   _mm512_set_epi32((e15),(e14),(e13),(e12),(e11),(e10),(e9),(e8),(e7),(e6), \
                    (e5),(e4),(e3),(e2),(e1),(e0))
 
-static __inline__ __m512i __DEFAULT_FN_ATTRS512
-_mm512_set_epi64 (long long __A, long long __B, long long __C,
-     long long __D, long long __E, long long __F,
-     long long __G, long long __H)
-{
+static __inline__ __m512i __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_set_epi64(long long __A, long long __B, long long __C, long long __D,
+                 long long __E, long long __F, long long __G, long long __H) {
   return __extension__ (__m512i) (__v8di)
   { __H, __G, __F, __E, __D, __C, __B, __A };
 }
@@ -9258,10 +9247,9 @@ _mm512_set_epi64 (long long __A, long long __B, long long __C,
 #define _mm512_setr_epi64(e0,e1,e2,e3,e4,e5,e6,e7)           \
   _mm512_set_epi64((e7),(e6),(e5),(e4),(e3),(e2),(e1),(e0))
 
-static __inline__ __m512d __DEFAULT_FN_ATTRS512
-_mm512_set_pd (double __A, double __B, double __C, double __D,
-        double __E, double __F, double __G, double __H)
-{
+static __inline__ __m512d __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_set_pd(double __A, double __B, double __C, double __D, double __E,
+              double __F, double __G, double __H) {
   return __extension__ (__m512d)
   { __H, __G, __F, __E, __D, __C, __B, __A };
 }
@@ -9269,12 +9257,10 @@ _mm512_set_pd (double __A, double __B, double __C, double __D,
 #define _mm512_setr_pd(e0,e1,e2,e3,e4,e5,e6,e7)              \
   _mm512_set_pd((e7),(e6),(e5),(e4),(e3),(e2),(e1),(e0))
 
-static __inline__ __m512 __DEFAULT_FN_ATTRS512
-_mm512_set_ps (float __A, float __B, float __C, float __D,
-        float __E, float __F, float __G, float __H,
-        float __I, float __J, float __K, float __L,
-        float __M, float __N, float __O, float __P)
-{
+static __inline__ __m512 __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_set_ps(float __A, float __B, float __C, float __D, float __E, float __F,
+              float __G, float __H, float __I, float __J, float __K, float __L,
+              float __M, float __N, float __O, float __P) {
   return __extension__ (__m512)
   { __P, __O, __N, __M, __L, __K, __J, __I,
     __H, __G, __F, __E, __D, __C, __B, __A };
diff --git a/clang/lib/Headers/avx512fp16intrin.h b/clang/lib/Headers/avx512fp16intrin.h
index 92df320b45006..e1b198b7fc5c9 100644
--- a/clang/lib/Headers/avx512fp16intrin.h
+++ b/clang/lib/Headers/avx512fp16intrin.h
@@ -33,15 +33,26 @@ typedef _Float16 __m512h_u __attribute__((__vector_size__(64), __aligned__(1)));
                  __target__("avx512fp16,no-evex512"),                          \
                  __min_vector_width__(128)))
 
+#if defined(__cplusplus) && (__cplusplus >= 201103L)
+#define __DEFAULT_FN_ATTRS512_CONSTEXPR __DEFAULT_FN_ATTRS512 constexpr
+#define __DEFAULT_FN_ATTRS256_CONSTEXPR __DEFAULT_FN_ATTRS256 constexpr
+#define __DEFAULT_FN_ATTRS128_CONSTEXPR __DEFAULT_FN_ATTRS128 constexpr
+#else
+#define __DEFAULT_FN_ATTRS512_CONSTEXPR __DEFAULT_FN_ATTRS512
+#define __DEFAULT_FN_ATTRS256_CONSTEXPR __DEFAULT_FN_ATTRS256
+#define __DEFAULT_FN_ATTRS128_CONSTEXPR __DEFAULT_FN_ATTRS128
+#endif
+
 static __inline__ _Float16 __DEFAULT_FN_ATTRS512 _mm512_cvtsh_h(__m512h __a) {
   return __a[0];
 }
 
-static __inline __m128h __DEFAULT_FN_ATTRS128 _mm_setzero_ph(void) {
+static __inline __m128h __DEFAULT_FN_ATTRS128_CONSTEXPR _mm_setzero_ph(void) {
   return (__m128h){0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
 }
 
-static __inline __m256h __DEFAULT_FN_ATTRS256 _mm256_setzero_ph(void) {
+static __inline __m256h __DEFAULT_FN_ATTRS256_CONSTEXPR
+_mm256_setzero_ph(void) {
   return (__m256h){0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
                    0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
 }
@@ -50,7 +61,8 @@ static __inline__ __m256h __DEFAULT_FN_ATTRS256 _mm256_undefined_ph(void) {
   return (__m256h)__builtin_ia32_undef256();
 }
 
-static __inline __m512h __DEFAULT_FN_ATTRS512 _mm512_setzero_ph(void) {
+static __inline __m512h __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_setzero_ph(void) {
   return (__m512h){0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
                    0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
                    0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
@@ -64,14 +76,15 @@ static __inline__ __m512h __DEFAULT_FN_ATTRS512 _mm512_undefined_ph(void) {
   return (__m512h)__builtin_ia32_undef512();
 }
 
-static __inline __m512h __DEFAULT_FN_ATTRS512 _mm512_set1_ph(_Float16 __h) {
+static __inline __m512h __DEFAULT_FN_ATTRS512_CONSTEXPR
+_mm512_set1_ph(_Float16 __h) {
   return (__m512h)(__v32hf){__h, __h, __h, __h, __h, __h, __h, __h,
                             __h, __h, __h, __h, __h, __h, __h, __h,
                             __h, __h, __h, __h, __h, __h, __h, __h,
                             __h, __h, __h, __h, __h, __h, __h, __h};
 }
 
-static __inline __m512h __DEFAULT_FN_ATTRS512
+static __inline __m512h __DEFAULT_FN_ATTRS512_CONSTEXPR
 _mm512_set_ph(_Float16 __h1, _Float16 __h2, _Float16 __h3, _Float16 __h4,
               _Float16 __h5, _Float16 __h6, _Float16 __h7, _Float16 __h8,
               _Float16 __h9, _Float16 __h10, _Float16 __h11, _Float16 __h12,
@@ -3348,6 +3361,9 @@ _mm512_permutexvar_ph(__m512i __A, __m512h __B) {
 #undef __DEFAULT_FN_ATTRS128
 #undef __DEFAULT_FN_ATTRS256
 #undef __DEFAULT_FN_ATTRS512
+#undef __DEFAULT_FN_ATTRS128_CONSTEXPR
+#undef __DEFAULT_FN_ATTRS256_CONSTEXPR
+#undef __DEFAULT_FN_ATTRS512_CONSTEXPR
 
 #endif
 #endif
diff --git a/clang/test/CodeGen/X86/avx512f-builtins.c b/clang/test/CodeGen/X86/avx512f-builtins.c
index 8c446f5715c00..6b82ba51ffc48 100644
--- a/clang/test/CodeGen/X86/avx512f-builtins.c
+++ b/clang/test/CodeGen/X86/avx512f-builtins.c
@@ -10276,6 +10276,11 @@ __m512i test_mm512_set_epi8(char e63, char e62, char e61, char e60, char e59,
       e25, e24, e23, e22, e21, e20, e19, e18, e17, e16, e15, e14, e13, e12,
       e11, e10, e9, e8, e7, e6, e5, e4, e3, e2, e1, e0);
 }
+TEST_CONSTEXPR(match_v64qi(_mm512_set_epi8(63, 62, 61, 60, 59, 58, 57, 56, 55, 54, 53, 52, 51, 50, 49, 48,
+                                          47, 46, 45, 44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34, 33, 32, 
+                                          31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16,
+                                          15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0), 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
+                                          32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63));
 
 __m512i test_mm512_set_epi16(short e31, short e30, short e29, short e28,
     short e27, short e26, short e25, short e24, short e23, short e22,
@@ -10319,8 +10324,9 @@ __m512i test_mm512_set_epi16(short e31, short e30, short e29, short e28,
   return _mm512_set_epi16(e31, e30, e29, e28, e27, e26, e25, e24, e23, e22,
       e21, e20, e19, e18, e17, e16, e15, e14, e13, e12, e11, e10, e9, e8, e7,
       e6, e5, e4, e3, e2, e1, e0);
-
 }
+TEST_CONSTEXPR(match_v32hi(_mm512_set_epi16(-31, 30, -29, 28, -27, 26, -25, 24, -23, 22, -21, 20, -19, 18, -17, 16, -15, 14, -13, 12, -11, 10, -9, 8, -7, 6, -5, 4, -3, 2, -1, 0), 0, -1, 2, -3, 4, -5, 6, -7, 8, -9, 10, -11, 12, -13, 14, -15, 16, -17, 18, -19, 20, -21, 22, -23, 24, -25, 26, -27, 28, -29, 30, -31));
+
 __m512i test_mm512_set_epi32 (int __A, int __B, int __C, int __D,
                int __E, int __F, int __G, int __H,
                int __I, int __J, int __K, int __L,
@@ -10346,6 +10352,7 @@ __m512i test_mm512_set_epi32 (int __A, int __B, int __C, int __D,
  return _mm512_set_epi32( __A, __B, __C, __D,__E, __F, __G, __H,
               __I, __J, __K, __L,__M, __N, __O, __P);
 }
+TEST_CONSTEXPR(match_v16si(_mm512_set_epi32(-15, -14, -13, -12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0), 0, -1, -2, -3, -4, -5, -6, -7, -8, -9, -10, -11, -12, -13, -14, -15));
 
 __m512i test_mm512_setr_epi32 (int __A, int __B, int __C, int __D,
                int __E, int __F, int __G, int __H,
@@ -10388,6 +10395,7 @@ __m512i test_mm512_setr_epi32 (int __A, int __B, int __C, int __D,
  return _mm512_setr_epi32( __A, __B, __C, __D,__E, __F, __G, __H,
               __I, __J, __K, __L,__M, __N, __O, __P);
 }
+TEST_CONSTEXPR(match_v16si(_mm512_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15), 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15));
 
 __m512i test_mm512_mask_set1_epi64 (__m512i __O, __mmask8 __M, long long __A)
 {
@@ -10435,6 +10443,7 @@ __m512i test_mm512_set_epi64 (long long __A, long long __B, long long __C,
     //CHECK: insertelement{{.*}}i32 7
   return _mm512_set_epi64(__A, __B, __C, __D, __E, __F, __G, __H );
 }
+TEST_CONSTEXPR(match_v8di(_mm512_set_epi64(-15, 13, -11, 9, -7, 5, -3, 1),  1, -3, 5, -7, 9, -11, 13, -15));
 
 __m512i test_mm512_setr_epi64 (long long __A, long long __B, long long __C,
                               long long __D, long long __E, long long __F,
@@ -10459,6 +10468,7 @@ __m512i test_mm512_setr_epi64 (long long __A, long long __B, long long __C,
     //CHECK: insertelement{{.*}}i32 7
   return _mm512_setr_epi64(__A, __B, __C, __D, __E, __F, __G, __H );
 }
+TEST_CONSTEXPR(match_v8di(_mm512_setr_epi64(-1, 3, -5, 7, -9, 11, -13, 15), -1, 3, -5, 7, -9, 11, -13, 15));
 
 __m512d test_mm512_set_pd (double __A, double __B, double __C, double __D,
                            double __E, double __F, double __G, double __H)
@@ -10474,6 +10484,7 @@ __m512d test_mm512_set_pd (double __A, double __B, double __C, double __D,
     //CHECK: insertelement{{.*}}i32 7
   return _mm512_set_pd( __A, __B, __C, __D, __E, __F, __G, __H);
 }
+TEST_CONSTEXPR(match_m512d(_mm512_set_pd(20.0, 40.0, 60.0, 80.0, 100.0, 120.0, 140.0, 160.0), 160.0, 140.0, 120.0, 100.0, 80.0, 60.0, 40.0, 20.0));
 
 __m512d test_mm512_setr_pd (double __A, double __B, double __C, double __D,
                            double __E, double __F, double __G, double __H)
@@ -10497,6 +10508,7 @@ __m512d test_mm512_setr_pd (double __A, double __B, double __C, double __D,
     //CHECK: insertelement{{.*}}i32 7
   return _mm512_setr_pd( __A, __B, __C, __D, __E, __F, __G, __H);
 }
+TEST_CONSTEXPR(match_m512d(_mm512_setr_pd(-20.0, 40.0, -60.0, 80.0, -100.0, 120.0, -140.0, 160.0), -20.0, 40.0, -60.0, 80.0, -100.0, 120.0, -140.0, 160.0));
 
 __m512 test_mm512_set_ps (float __A, float __B, float __C, float __D,
                           float __E, float __F, float __G, float __H,
@@ -10523,6 +10535,7 @@ __m512 test_mm512_set_ps (float __A, float __B, float __C, float __D,
     return _mm512_set_ps( __A, __B, __C, __D, __E, __F, __G, __H,
                           __I, __J, __K, __L, __M, __N, __O, __P);
 }
+TEST_CONSTEXPR(match_m512(_mm512_set_ps(-16.0f, 15.0f, -14.0f, 13.0f, -12.0f, 11.0f, -10.0f, 9.0f, -8.0f, 7.0f, -6.0f, 5.0f, -4.0f, 3.0f, -2.0f, 1.0f), 1.0f, -2.0f, 3.0f, -4.0f, 5.0f, -6.0f, 7.0f, -8.0f, 9.0f, -10.0f, 11.0f, -12.0f, 13.0f, -14.0f, 15.0f, -16.0f));
 
 __m512i test_mm512_mask_abs_epi64 (__m512i __W, __mmask8 __U, __m512i __A)
 {
@@ -10601,6 +10614,7 @@ __m512 test_mm512_setr_ps (float __A, float __B, float __C, float __D,
     return _mm512_setr_ps( __A, __B, __C, __D, __E, __F, __G, __H,
                           __I, __J, __K, __L, __M, __N, __O, __P);
 }
+TEST_CONSTEXPR(match_m512(_mm512_setr_ps(-1.0f, 2.0f, -3.0f, 4.0f, -5.0f, 6.0f, -7.0f, 8.0f, -9.0f, 10.0f, -11.0f, 12.0f, -13.0f, 14.0f, -15.0f, 16.0f), -1.0f, 2.0f, -3.0f, 4.0f, -5.0f, 6.0f, -7.0f, 8.0f, -9.0f, 10.0f, -11.0f, 12.0f, -13.0f, 14.0f, -15.0f, 16.0f));
 
 int test_mm_cvtss_i32(__m128 A) {
   // CHECK-LABEL: test_mm_cvtss_i32
diff --git a/clang/test/CodeGen/X86/avx512fp16-builtins.c b/clang/test/CodeGen/X86/avx512fp16-builtins.c
index d277d053147fd..f2b5bd0e2f763 100644
--- a/clang/test/CodeGen/X86/avx512fp16-builtins.c
+++ b/clang/test/CodeGen/X86/avx512fp16-builtins.c
@@ -1,6 +1,7 @@
 // RUN: %clang_cc1 -ffreestanding -flax-vector-conversions=none %s -triple=x86_64-unknown-unknown -target-feature +avx512fp16 -emit-llvm -o - -Wall -Werror | FileCheck %s
 
 #include <immintrin.h>
+#include "builtin_test_helpers.h"
 
 _Float16 test_mm512_cvtsh_h(__m512h __A) {
   // CHECK-LABEL: @test_mm512_cvtsh_h
@@ -13,12 +14,14 @@ __m128h test_mm_setzero_ph(void) {
   // CHECK: zeroinitializer
   return _mm_setzero_ph();
 }
+TEST_CONSTEXPR(match_m128(_mm_setzero_ph(), +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f));
 
 __m256h test_mm256_setzero_ph(void) {
   // CHECK-LABEL: @test_mm256_setzero_ph
   // CHECK: zeroinitializer
   return _mm256_setzero_ph();
 }
+TEST_CONSTEXPR(match_m256(_mm256_setzero_ph(), +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f));
 
 __m256h test_mm256_undefined_ph(void) {
   // CHECK-LABEL: @test_mm256_undefined_ph
@@ -31,6 +34,7 @@ __m512h test_mm512_setzero_ph(void) {
   // CHECK: zeroinitializer
   return _mm512_setzero_ph();
 }
+TEST_CONSTEXPR(match_m512(_mm512_setzero_ph(), +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f, +0.0f));
 
 __m128h test_mm_undefined_ph(void) {
   // CHECK-LABEL: @test_mm_undefined_ph
@@ -80,6 +84,7 @@ __m512h test_mm512_set1_ph(_Float16 h) {
   // CHECK: insertelement <32 x half> {{.*}}, i32 31
   return _mm512_set1_ph(h);
 }
+TEST_CONSTEXPR(match_m512(_mm512_set1_ph(-101.0f), -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f, -101.0f));
 
 __m512h test_mm512_set1_pch(_Float16 _Complex h) {
   // CHECK-LABEL: @test_mm512_set1_pch
@@ -149,6 +154,13 @@ __m512h test_mm512_set_ph(_Float16 __h1, _Float16 __h2, _Float16 __h3, _Float16
                        __h17, __h18, __h19, __h20, __h21, __h22, __h23, __h24,
                        __h25, __h26, __h27, __h28, __h29, __h30, __h31, __h32);
 }
+TEST_CONSTEXPR(match_mm512(_mm512_set_ph(-1.0f, 2.0f, -3.0f, 4.0f, -5.0f, 6.0f, -7.0f, 8.0f,
+                                        -9.0f, 10.0f, -11.0f, 12.0f, -13.0f, 14.0f, -15.0f, 16.0f,
+                                        -17.0f, 18.0f, -19.0f, 20.0f, -21.0f, 22.0f, -23.0f, 24.0f,
+     ...
[truncated]

@markbhasawut markbhasawut force-pushed the x86-constexpr-avx512-set branch 4 times, most recently from 6d9eba4 to ecaab6f Compare August 11, 2025 09:08
@RKSimon RKSimon requested review from RKSimon and phoebewang August 11, 2025 09:09
…texpr

This PR adds constexpr support for the following AVX512F/BW/FP16 set intrinsics:

  - _mm512_set4_pd
  - _mm512_set4_ps
  - _mm512_set4_epi32
  - _mm512_set4_epi64

  - _mm512_set_pd
  - _mm512_set_ps
  - _mm512_set_epi8
  - _mm512_set_epi16
  - _mm512_set_epi32
  - _mm512_set_epi64

  - _mm512_setr_pd
  - _mm512_setr_ps
  - _mm512_setr_epi32
  - _mm512_setr_epi64

  - _mm512_set1_ph
  - _mm512_set_ph
  - _mm512_setr_ph

  - _mm_setzero_ph
  - _mm256_setzero_ph
  - _mm512_setzero_ph

Closes llvm#152288.
Part of llvm#30794.
@markbhasawut markbhasawut force-pushed the x86-constexpr-avx512-set branch from ecaab6f to 9f81cec Compare August 11, 2025 13:02
@markbhasawut markbhasawut marked this pull request as draft August 11, 2025 13:03
@markbhasawut markbhasawut force-pushed the x86-constexpr-avx512-set branch from 245b4d1 to 9b0dd2c Compare August 11, 2025 14:16
@markbhasawut markbhasawut marked this pull request as ready for review August 11, 2025 14:23
@markbhasawut markbhasawut requested a review from RKSimon August 11, 2025 14:41
Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - cheers

@RKSimon RKSimon merged commit 26a0962 into llvm:main Aug 11, 2025
11 checks passed
Copy link

@markbhasawut Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

@llvm-ci
Copy link
Collaborator

llvm-ci commented Aug 11, 2025

LLVM Buildbot has detected a new failure on builder openmp-s390x-linux running on systemz-1 while building clang at step 6 "test-openmp".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/88/builds/14946

Here is the relevant piece of the build log for the reference
Step 6 (test-openmp) failure: test (failure)
******************** TEST 'libomp :: tasking/issue-94260-2.c' FAILED ********************
Exit Code: -11

Command Output (stdout):
--
# RUN: at line 1
/home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/./bin/clang -fopenmp   -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test -L /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -fno-omit-frame-pointer -mbackchain -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/ompt /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/tasking/issue-94260-2.c -o /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp -lm -latomic && /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp
# executed command: /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/./bin/clang -fopenmp -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test -L /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -fno-omit-frame-pointer -mbackchain -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/ompt /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/tasking/issue-94260-2.c -o /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp -lm -latomic
# executed command: /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp
# note: command had no output on stdout or stderr
# error: command failed with exit status: -11

--

********************


markbhasawut added a commit to markbhasawut/llvm-project that referenced this pull request Aug 11, 2025
…finition for non-constexpr

While working on llvm#152910, I spotted some incorrectly defined
`__DEFAULT_FN_ATTRS*_CONSTEXPR` macro for non-constexpr context.
markbhasawut added a commit to markbhasawut/llvm-project that referenced this pull request Aug 11, 2025
While working on PR llvm#152910, I discovered some incorrectly defined
`__DEFAULT_FN_ATTRS*_CONSTEXPR` macros for non-constexpr contexts.
phoebewang pushed a commit that referenced this pull request Aug 12, 2025
…3060)

While working on PR #152910, I discovered some incorrectly defined
`__DEFAULT_FN_ATTRS*_CONSTEXPR` macros for non-constexpr contexts.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 clang:headers Headers provided by Clang, e.g. for intrinsics clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Headers][X86] Allow AVX512 _mm512_set* intrinsics to be used in constexpr
4 participants